Abstract. In a vertex-labeled graph, each vertex $v$ carries a label from a
set of labels. A vertex-label query asks for the length of the shortest path
from a given vertex to the set of vertices with a given label. We show how to
construct an oracle for planar graphs that uses $O(\frac{1}{\epsilon} n \log n)$
space and answers any vertex-label query in
$O(\frac{1}{\epsilon} \log n \log \rho)$ time with stretch $1 + \epsilon$.
Here $\rho$ is the radius of the given graph, which is half of the diameter.
For the case $\rho = O(\log n)$, we construct an oracle that achieves
$O(\frac{1}{\epsilon} \log n)$ query time without changing the order of the space.



1 Introduction 

We consider undirected graphs in which each vertex carries a label from a set
of labels, denoted by $L$. Fix such a graph $G = (V, E)$ and the label set $L$.
The distance between two nodes $v, u \in V$, denoted by $\delta(v, u)$, is the
length of the shortest path connecting $v$ and $u$ in $G$. The distance between
a vertex $u \in V$ and a label $\lambda \in L$, denoted by $\delta(u, \lambda)$,
is the distance between $u$ and the node $v$ that is closest to $u$ among all
nodes with label $\lambda$, i.e.
$\delta(u, \lambda) = \min\{\delta(u, v) \mid v \text{ has label } \lambda\}$.
In applications involving graphs, the vertex-label distance query is asked
often and is used as a basic subprocedure in more complicated tasks. For
example, navigation software needs to answer how far the closest store
offering a specified service is from the current position. Since such
questions arise very frequently, the answer should be returned in as little
time as possible. Trivially, one could precalculate and store the answers to
all possible queries. However, this may take too much space, of order
$O(|V| \times |L|)$.
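The trivial precomputation mentioned above can be sketched as follows. This is only an illustrative baseline, not part of the oracle: it assumes an adjacency-list graph with non-negative weights and one label per node, and the names `label_distance_table`, `adj` and `label` are hypothetical. It runs one multi-source Dijkstra search per label and stores all $|V| \times |L|$ answers.

```python
import heapq
from collections import defaultdict

def label_distance_table(adj, label):
    """Return table[(u, lam)] = exact distance from u to the nearest node labeled lam.

    adj:   dict node -> list of (neighbor, weight) pairs (undirected graph)
    label: dict node -> label (one label per node)
    """
    nodes_by_label = defaultdict(list)
    for v, lam in label.items():
        nodes_by_label[lam].append(v)

    table = {}
    for lam, sources in nodes_by_label.items():
        # Multi-source Dijkstra: every lam-labeled node starts at distance 0.
        dist = {s: 0 for s in sources}
        heap = [(0, s) for s in sources]
        heapq.heapify(heap)
        while heap:
            d, x = heapq.heappop(heap)
            if d > dist.get(x, float("inf")):
                continue  # stale heap entry
            for y, w in adj[x]:
                nd = d + w
                if nd < dist.get(y, float("inf")):
                    dist[y] = nd
                    heapq.heappush(heap, (nd, y))
        for u, d in dist.items():
            table[(u, lam)] = d
    return table

# Path a - b - c - d with unit weights; a is a "cafe", d is a "bank".
adj = {
    "a": [("b", 1)],
    "b": [("a", 1), ("c", 1)],
    "c": [("b", 1), ("d", 1)],
    "d": [("c", 1)],
}
label = {"a": "cafe", "b": "none", "c": "none", "d": "bank"}
table = label_distance_table(adj, label)
print(table[("c", "cafe")], table[("c", "bank")])  # 2 1
```

The table answers every query in constant time, but its size grows with the product of the number of nodes and the number of labels, which is the cost the oracle is meant to avoid.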

The aim of a distance oracle is to precalculate and store information using
less than $O(|V| \times |L|)$ space, such that any distance query can be
answered more efficiently than by computing it from the graph structure alone.
An approximate distance oracle answers the query with some stretch. In detail,
a distance oracle with stretch $1 + \epsilon$ returns $d(u, \lambda)$ as the
approximation to $\delta(u, \lambda)$, such that
$\delta(u, \lambda) \le d(u, \lambda) \le (1 + \epsilon)\,\delta(u, \lambda)$.

Since in many practical cases the given graph is drawn on a plane, a distance
oracle tailored to planar graphs has wide applications.



1.1 Related Work 



Vertex-Label Distance Oracle. The problem of constructing approximate distance
oracles for vertex-labeled graphs was formalized and studied by Hermelin et
al. in [3]. Let $n$ denote the number of nodes and $m$ the number of edges.
They adapted the approximation scheme introduced by Thorup and Zwick in [10]
to construct vertex-label distance oracles with expected size
$O(kn^{1+1/k})$, stretch $4k - 5$ and query time $O(k)$. The preprocessing
time is $O(kmn^{1/k})$. Let $\ell = |L|$. They also constructed vertex-label
distance oracles with expected size $O(kn\ell^{1/k})$, stretch $2^k - 1$ and
query time $O(k)$. The preprocessing time is $O(kmn^{k/(2k-1)})$. For a vertex
in the graph, the associated label may change.
A simple way to support label changes is to construct a new distance oracle.
In [3], they constructed vertex-label distance oracles with expected size
[...], stretch $2 \cdot 3^{k-1} + 1$ and query time $O(k)$, which support
label changes in $O(kn^{1/k} \log n)$ time. In [1], Chechik showed that Thorup
and Zwick's scheme can also be modified to support label changes in
$O(n^{1/k} \log^{1-1/k} n \log\log n)$ time, with expected size
$\tilde{O}(n^{1+1/k})$, stretch $4k - 5$ and query time $O(k)$. The
preprocessing time is $O(kmn^{1/k})$.

The vertex-label distance oracle has also been studied for some specific
classes of graphs. Tao et al. have shown how to construct vertex-label
distance oracles for XML trees in [8]. For the case that each node is assigned
exactly one label, their construction yields exact vertex-label distance
oracles with size $O(n)$ and query time $O(\log n)$. The preprocessing time is
$O(n \log n)$.

Vertex-Vertex Distance Oracle. In [10], Thorup and Zwick introduced a
well-known scheme to construct vertex-vertex distance oracles with expected
size $O(kn^{1+1/k})$, stretch $2k - 1$ and query time $O(k)$. The
preprocessing time is $O(kmn^{1/k})$. Wulff-Nilsen in [?] has improved the
preprocessing time to $O(\sqrt{k}\,m + kn^{1+c/\sqrt{k}})$ for some universal
constant $c$, which is better than $O(kmn^{1/k})$ except for very sparse
graphs and small $k$. For planar graphs, Klein in [5] has shown how to
construct vertex-vertex distance oracles with size
$O(\frac{1}{\epsilon} n \log n)$, stretch $1 + \epsilon$ and query time $O(1)$.

Shortest Path. The construction of distance oracles often relies on shortest
path algorithms in the preprocessing stage. Rather than surveying all methods
for computing shortest paths, we list only the most closely related ones here.
The others are introduced where they are used in our algorithm.

A shortest path tree with vertex $v$ is a tree rooted at $v$, consisting of
all nodes and a subset of the edges of the given graph, such that for any $u$
in the given graph, the path from $v$ to $u$ in the tree is a shortest path
from $v$ to $u$ in the original graph. Given a single vertex, calculating the
shortest path tree rooted at it is called the single source shortest path
problem. In undirected graphs, the single source shortest path tree can be
calculated in $O(m)$ time, where $m$ is the number of edges in the given
graph. The algorithm was introduced by Thorup in [?]. In directed graphs, the
single source shortest path tree can be calculated in $O(m + n \log n)$ time,
where $n$ is the number of nodes in the given graph. This is done by the
well-known Dijkstra algorithm using a Fibonacci heap [2].

1.2 Simple Solution in Doubling Metric Spaces

If the metric implied by the given graph is doubling, the following procedure
provides a simple way to return $\delta(u, \lambda)$ with $(1 + \epsilon)$-stretch.

Preprocessing. Let $\epsilon' = \epsilon/3$. For $\epsilon < 1$,
$(1 + \epsilon')^2 \le 1 + \epsilon$. For each label $\lambda \in L$,
construct an oracle that supports $(1 + \epsilon')$-nearest neighbor search.
In addition, construct an oracle that supports $(1 + \epsilon')$ vertex-vertex
distance queries.

Query. Given $u \in V$ and $\lambda \in L$, find the $(1 + \epsilon')$-NN of
$u$ among the nodes with label $\lambda$, and then query for their
$(1 + \epsilon')$ distance.

Space and Query Time. The oracle supporting approximate nearest neighbor
search for $\lambda$ can be constructed using $O(n_\lambda)$ space, where
$n_\lambda$ is the number of nodes with label $\lambda$. Since each node
carries only one label, the space used in total is $O(n)$. This kind of oracle
answers a query in $O(\log n_\lambda) = O(\log n)$ time. The oracle supporting
approximate vertex-vertex distance queries can be constructed using $O(n)$
space and answers a query in $O(1)$ time.
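As a quick check of why the two approximations compose to stretch $1 + \epsilon$, here is a short worked derivation. It assumes $\epsilon' = \epsilon/3$, which is one choice consistent with the inequality stated in the preprocessing step; $v'$ denotes the approximate nearest neighbor returned by the first oracle and $d(u, v')$ the value returned by the second.

```latex
\[
d(u, v') \le (1+\epsilon')\,\delta(u, v')
         \le (1+\epsilon')^2\,\delta(u, \lambda),
\]
\[
(1+\epsilon')^2 = 1 + \tfrac{2\epsilon}{3} + \tfrac{\epsilon^2}{9}
\le 1 + \tfrac{2\epsilon}{3} + \tfrac{\epsilon}{9}
\le 1 + \epsilon
\qquad (0 < \epsilon < 1).
\]
```

The second line uses $\epsilon^2 \le \epsilon$ for $\epsilon < 1$.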

1.3 Our Contribution 

As the related work above shows, we are not aware of any vertex-label distance
oracle for planar graphs. In this paper, we show the following two results.

Theorem 1. Given an undirected planar graph $G = (V, E)$ and a label set $L$,
each vertex $v \in V$ carries one label in $L$. For any $0 < \epsilon < 1$,
there exists an oracle that answers any vertex-label query with stretch
$1 + \epsilon$ in $O(\frac{1}{\epsilon} \log n \log \rho)$ time. The oracle
needs $O(\frac{1}{\epsilon} n \log n)$ space.

Theorem 2. Given an undirected planar graph $G = (V, E)$ and a label set $L$,
each vertex $v \in V$ carries one label in $L$. If the radius of $G$ is of
order $O(\log n)$, then for any $0 < \epsilon < 1$, there exists an oracle
that answers any vertex-label query with stretch $1 + \epsilon$ in
$O(\frac{1}{\epsilon} \log n)$ time. The oracle needs
$O(\frac{1}{\epsilon} n \log n)$ space.

2 Preliminary 

Lipton Tarjan Separator [?]. Let $T$ be a spanning tree of a planar embedded
triangulated graph $G$ with weights on nodes. Then there is an edge
$e \notin T$ such that the strict interior and the strict exterior of the
simple cycle in $T \cup \{e\}$ each contain weight no more than $\frac{2}{3}$
of the total weight.



Recursive Graph Decomposition [?]. The recursive graph decomposition (RGD)
of a given graph $G$ is a rooted tree, such that each vertex $p$ of the tree
maintains

- a set $N(p)$ of nodes in $G$; in particular the root of RGD maintains (as a
  label) $N(p) = V(G)$, and

- $p$ is a leaf of RGD iff $N(p)$ contains only one node of $G$, in which case
  let $S(p) = N(p)$;

- if $p$ is not a leaf of RGD, it maintains (as a label) an $\alpha$-balanced
  separator $S(p)$ of $G$, balanced with respect to the weight assignment in
  which each node in $N(p)$ is assigned weight 1 and all other nodes are
  assigned weight 0.

A non-leaf vertex $p$ of the tree has two children $p_1$ and $p_2$, such that

- $N(p_1) = \{v \in N(p) \cap \mathrm{ext}(\tilde{S}(p))\}$, and

- $N(p_2) = \{v \in N(p) \cap \mathrm{int}(\tilde{S}(p))\}$,

where $\tilde{S}$ denotes the cycle corresponding to a separator $S$. For a
leaf node $p$ of RGD, $N(p)$ contains only one node of $G$. In practice,
$N(p)$ may contain a small number of nodes, such that the distances in the
subgraph induced by $N(p)$ for every pair of nodes in $N(p)$ are precalculated
and stored in a table supporting $O(1)$-time look-up.

Range Minimum Query. The range minimum query problem is to preprocess an
array of length $n$ in $O(n)$ time such that all subsequent queries asking for
the position of a minimal element between two specified indices can be
answered quickly. This can be done in constant time using no more than
$2n + o(n)$ bits [?].
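For illustration, a range minimum query structure with constant-time queries can be sketched with a sparse table. Note that this is a simpler, swapped-in technique: it uses $O(n \log n)$ words and preprocessing time rather than the $2n + o(n)$-bit structure cited above. The function names are hypothetical.

```python
def build_sparse_table(a):
    """table[j][i] = index of a minimal element in a[i : i + 2**j]."""
    n = len(a)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        half = 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + half]
            row.append(left if a[left] <= a[right] else right)
        table.append(row)
        j += 1
    return table

def range_min_index(a, table, lo, hi):
    """Index of a minimal element in a[lo..hi], both ends inclusive."""
    j = (hi - lo + 1).bit_length() - 1
    # Two overlapping blocks of length 2**j cover the whole range.
    left, right = table[j][lo], table[j][hi - (1 << j) + 1]
    return left if a[left] <= a[right] else right

a = [5, 2, 4, 7, 1, 3]
t = build_sparse_table(a)
print(range_min_index(a, t, 0, 3), range_min_index(a, t, 2, 5))  # 1 4
```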

2.1 Notation 

Projection. Given a set of nodes $S$ and a node $v$, we define the projection
of $v$ on $S$ as the node in $S$ that is closest to $v$.

Radius. A graph has radius $\rho$ iff it has a shortest path tree with at most
$\rho$ levels.

3 $(1 + \epsilon)$-Stretch, $O(\frac{1}{\epsilon} \log n \log \rho)$ Query Time Oracle
3.1 Preprocessing 

Find the node $r \in G$ whose shortest path tree has $\rho$ levels, compute
the shortest-path tree $T$ in $G$ rooted at $r$ and, based on $T$, calculate
the RGD. Then store:

1. a table recording, for each node $v \in G$, the leaf node $p \in$ RGD such
   that $v \in N(p)$;

2. a table recording, for each node $p \in$ RGD, the depth of $p$ in RGD;

3. a representation of RGD supporting quick ($O(1)$ time) computation of the
   lowest common ancestor (lca);

4. a table $T_v$ for each node $v \in G$ recording, for each $p \in$ RGD such
   that $v \in N(p)$, two sub-tables for the paths $P'$, $P''$ in the
   separator $S(p)$, respectively. In detail, $T_v[p][P']$ (and similarly
   $T_v[p][P'']$) consists of a sequence of $O(\frac{1}{\epsilon})$ pairs
   $(d_{-q}, h_{-q}), \ldots, (d_0, h_0), \ldots, (d_w, h_w)$, where $d_i$ is
   the distance from $v$ to a node $z_i$ on $P'$ and $h_i$ is the distance
   from $z_i$ to $r$ (the root of the shortest path tree $T$), such that the
   sequence has the distance property: for any node $w$ on $P'$, there is a
   node $z_i$ such that the distance from $v$ to $z_i$ plus the distance from
   $z_i$ to $w$ is at most $(1 + \epsilon)$ times the distance from $v$ to
   $w$. We refer to the nodes $z_i$ as portals, and to the corresponding $d_i$
   as portal distances. In [5], it is proved that $O(\frac{1}{\epsilon})$
   portals are enough to guarantee the distance property. To be
   self-contained, we include a simple version of the proof in the appendix
   (see Appendix A).



Parts 1 to 3 need $O(n)$ space. Part 4 needs $O(\frac{1}{\epsilon} \log n)$
space per node, hence $O(\frac{1}{\epsilon} n \log n)$ space in total. In
addition, we store the portals for each label $\lambda$. In detail, we store



5. a table $T_\lambda$ for each label $\lambda \in L$, in which there is an
   entry for each piece $p \in$ RGD such that
   $N(p) \cap V(\lambda) \neq \emptyset$, where $V(\lambda)$ denotes the set
   of nodes with label $\lambda$. Each entry stores two sequences, one for
   each of the paths $P'$ and $P''$ forming $S(p)$. In detail, the sequence
   for $P'$ stores all the portals on $P'$ of the nodes in
   $N(p) \cap V(\lambda)$, in increasing order of their distances from the
   root $r$.

6. a hash table for each label $\lambda$ that indicates whether there is an
   entry for a given separator in $T_\lambda$ and returns its index in
   $T_\lambda$ if so (both operations can be done in constant time).



Part 5 needs $O(n \log n)$ space in total. Part 6 needs $O(n \log n)$ space in
total, since $T_\lambda$ contains an entry for a separator $S(p)$ iff there
exists at least one node with label $\lambda$ in $N(p)$.



3.2 Query 

Given a node $u \in G$ and a label $\lambda \in L$, proceed as in Algorithm 1.

Input: $u$, $\lambda$
Initialization: $d(u, \lambda) \leftarrow \infty$
for each $p \in$ RGD s.t. $u \in N(p)$ and $T_\lambda$ has an entry for $p$ do
    for each path $P$ of $S(p)$ do
        for each portal $z_u$ of $u$ on $P$ do
            $C^+ \leftarrow$ {$\lambda$'s portals on $P$ that are at least as far from $r$ as $z_u$}
            $\{z^+, v^+\} \leftarrow$ the portal of some $\lambda$-labeled node $v$ that achieves
                $\min\{\delta(v, z_v) + h(z_v)\}$ over $C^+$, and $v$
            $C^- \leftarrow$ {$\lambda$'s portals on $P$ that are at most as far from $r$ as $z_u$}
            $\{z^-, v^-\} \leftarrow$ the portal of some $\lambda$-labeled node $v$ that achieves
                $\min\{\delta(v, z_v) - h(z_v)\}$ over $C^-$, and $v$
            $d' \leftarrow \min\{\delta(u, z_u) + \delta(z_u, z^+) + \delta(v^+, z^+),\ \delta(u, z_u) + \delta(z_u, z^-) + \delta(v^-, z^-)\}$
            $d(u, \lambda) \leftarrow \min\{d', d(u, \lambda)\}$
        end
    end
end
Output: $d(u, \lambda)$

Algorithm 1:
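The inner loop of Algorithm 1 for a single path can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the path $P$ is a root path of the shortest path tree, so the distance between two portals on $P$ is the difference of their $h$ values, and it replaces the general range minimum query by prefix and suffix minima, which suffice because $C^+$ and $C^-$ are a suffix and a prefix of the sorted portal sequence. The names `prepare_label_portals` and `query_on_path` are hypothetical.

```python
import bisect

def prepare_label_portals(portals):
    """portals: list of (h, d) pairs for one label on one path, where h is the
    portal's distance from the root r and d is the distance from the portal to
    the labeled node it serves."""
    portals = sorted(portals)
    hs = [h for h, _ in portals]
    # suffix_plus[i]  = min over portals i.. of (d + h), used for C+
    suffix_plus = [d + h for h, d in portals]
    for i in range(len(portals) - 2, -1, -1):
        suffix_plus[i] = min(suffix_plus[i], suffix_plus[i + 1])
    # prefix_minus[i] = min over portals ..i of (d - h), used for C-
    prefix_minus = [d - h for h, d in portals]
    for i in range(1, len(portals)):
        prefix_minus[i] = min(prefix_minus[i], prefix_minus[i - 1])
    return hs, suffix_plus, prefix_minus

def query_on_path(u_portals, prepared):
    """u_portals: list of (h, d) pairs of the query node u on the same path.
    Returns the best estimate d' over all portals of u."""
    hs, suffix_plus, prefix_minus = prepared
    best = float("inf")
    for h_u, d_u in u_portals:
        i = bisect.bisect_left(hs, h_u)       # first label portal with h >= h_u
        if i < len(hs):                       # C+ is non-empty
            best = min(best, d_u + suffix_plus[i] - h_u)
        j = bisect.bisect_right(hs, h_u) - 1  # last label portal with h <= h_u
        if j >= 0:                            # C- is non-empty
            best = min(best, d_u + prefix_minus[j] + h_u)
    return best

label_side = prepare_label_portals([(2, 3), (5, 1), (9, 4)])
print(query_on_path([(4, 2)], label_side))  # 4
```

In the example, the portal of $u$ at depth 4 is matched with the label portal at depth 5, giving $2 + (5 - 4) + 1 = 4$.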

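Lemma 1. Given $u$, $\lambda$, let $v$ be the $\lambda$-labeled node
satisfying $\delta(u, v) = \delta(u, \lambda)$. There exist a portal $z_u$ of
$u$ and a portal $z_v$ of $v$ on the same path $P$, such that
$\delta(u, z_u) + \delta(z_u, z_v) + \delta(z_v, v) \le (1 + \epsilon)\,\delta(u, \lambda)$.

Proof. Let $p_u$, $p_v$ be the lowest pieces in RGD containing $u$, $v$,
respectively, i.e. $u \in N(p_u)$ and $v \in N(p_v)$. Let $p_{uv}$ be the lca
of $p_u$ and $p_v$ in RGD. Then $u \in N(p_{uv})$, $v \in N(p_{uv})$, and the
shortest path from $u$ to $v$ crosses $S(p_{uv})$. Denote the crossing point
by $c$. There exists a portal $z_u$ of $u$ such that
$\delta(u, z_u) + \delta(z_u, c) \le (1 + \epsilon)\,\delta(u, c)$, and a
portal $z_v$ of $v$ such that
$\delta(v, z_v) + \delta(z_v, c) \le (1 + \epsilon)\,\delta(v, c)$.

By the triangle inequality, $\delta(z_u, z_v) \le \delta(z_u, c) + \delta(c, z_v)$,
and since $c$ lies on the shortest path from $u$ to $v$,
$\delta(u, c) + \delta(c, v) = \delta(u, \lambda)$. Hence
$\delta(u, z_u) + \delta(z_u, z_v) + \delta(z_v, v) \le (1 + \epsilon)\,\delta(u, \lambda)$.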

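This lemma implies that the output of Algorithm 1 achieves the
$(1 + \epsilon)$-approximation to $\delta(u, \lambda)$, since

- if $z_v$ is farther than $z_u$ from $r$, then
  $h(z_v) + \delta(z_v, v) \ge h(z^+) + \delta(z^+, v^+)$, and hence

  $$\delta(u, z_u) + \delta(z_u, z^+) + \delta(v^+, z^+)
    \le \delta(u, z_u) + \delta(z_u, z_v) + \delta(v, z_v)
    \le (1 + \epsilon)\,\delta(u, \lambda);$$

- if $z_v$ is closer than $z_u$ to $r$, then
  $-h(z_v) + \delta(z_v, v) \ge -h(z^-) + \delta(z^-, v^-)$, and hence

  $$\delta(u, z_u) + \delta(z_u, z^-) + \delta(v^-, z^-)
    \le \delta(u, z_u) + \delta(z_u, z_v) + \delta(v, z_v)
    \le (1 + \epsilon)\,\delta(u, \lambda).$$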

To show the query time $O(\frac{1}{\epsilon} \log n \log \rho)$, we only need
to show that $v^+$ ($v^-$) and $z^+$ ($z^-$) can be found in $O(\log \rho)$
time. This can be done by identifying the range $C^+$ ($C^-$) among the
portals of $\lambda$ on the specified path in $O(\log \rho)$ time, and then
locating $v^+$ ($v^-$) by a range minimum query in $O(1)$ time.

Theorem 1. Given an undirected planar graph $G = (V, E)$ and a label set $L$,
each vertex $v \in V$ carries one label in $L$. For any $0 < \epsilon < 1$,
there exists an oracle that answers any vertex-label query with stretch
$1 + \epsilon$ in $O(\frac{1}{\epsilon} \log n \log \rho)$ time. The oracle
needs $O(\frac{1}{\epsilon} n \log n)$ space.

3.3 3-Stretch, $O(\log n \log \rho)$ Query Time Oracle

Consider the case $\epsilon = 2$. The oracle supports 3-stretch with
$O(\log n \log \rho)$ query time, using $O(n \log n)$ space. The product of
space and query time (suggested by Christian Sommer [7]) is
$O(n \log^2 n \log \rho)$, which is better than $O(n^{3/2}) \times O(1)$ for
general graphs.

Note that in this case, each node $u$ has only one portal on a specified path
of a separator, namely its projection, denoted by $z_u$, i.e. the node on the
path closest to $u$. The reason is that for any node $z$ on the path, we have
$\delta(z, z_u) \le \delta(u, z_u) + \delta(u, z) \le 2\,\delta(u, z)$, and
hence $\delta(u, z) \le \delta(u, z_u) + \delta(z_u, z) \le 3\,\delta(u, z)$.

4 $O(1)$ Time to Identify $C^+$ ($C^-$) when $\rho = O(\log n)$

In the case $\rho = O(\log n)$, the time to identify $C^+$ ($C^-$) is
$O(\log \log n)$. We show that this can be improved to $O(1)$.

First, note that when we store the portals for a label $\lambda$, a node may
serve as the portal of several different nodes. Clearly, it suffices to store
only the one with the minimum portal-node distance. Thus, for a fixed label
$\lambda$ and a path of a separator, each node serves as at most one portal of
$\lambda$. Using a word of $\rho = O(\log n)$ bits, denoted by $\omega$, we
can record whether each node on the path is a portal, i.e. the $i$-th bit is 1
iff the $i$-th node on the path is a portal for $\lambda$. If the portals on a
path for $\lambda$ are stored in increasing order of their positions on the
path, the index of a portal can be retrieved by counting how many 1s there are
before the $i$-th position of $\omega$. Since any operation on a single word
is assumed to cost $O(1)$ time, we obtain an $O(1)$-time method to identify
$C^+$, with

- $O(n \log n)$ space to record, for each portal, its position on the path
  forming the separator; and

- $O(n \log n)$ space to store the $\omega$'s for all labels.
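The bit-word lookup described above can be sketched as follows. The sketch uses Python integers in place of machine words, so the constant-time claim relies on the word-RAM assumption made in the text rather than on this code; `portal_mask` and `rank_before` are hypothetical names.

```python
def portal_mask(portal_positions):
    """Bit i of the returned word is 1 iff the i-th node of the path is a portal."""
    w = 0
    for i in portal_positions:
        w |= 1 << i
    return w

def rank_before(w, i):
    """Number of portals at positions strictly before i (1 bits below bit i)."""
    return bin(w & ((1 << i) - 1)).count("1")

w = portal_mask([1, 4, 6])
# Portals are stored in increasing position order, so the portals at positions
# >= 3 start at index rank_before(w, 3) in that sequence.
print(rank_before(w, 3), rank_before(w, 7))  # 1 3
```

With the portal sequence sorted by position, `rank_before(w, i)` is the index at which $C^+$ begins for a query portal at position $i$.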

Theorem 2. Given an undirected planar graph $G = (V, E)$ and a label set $L$,
each vertex $v \in V$ carries one label in $L$. If the radius of $G$ is of
order $O(\log n)$, then for any $0 < \epsilon < 1$, there exists an oracle
that answers any vertex-label query with stretch $1 + \epsilon$ in
$O(\frac{1}{\epsilon} \log n)$ time. The oracle needs
$O(\frac{1}{\epsilon} n \log n)$ space.

5 Label Changes 

We consider the cost of updating the oracle when a node $v$ changes its label
from $\lambda_1$ to $\lambda_2$. The portals of $v$ are not affected. However,
the portals of $\lambda_1$ and $\lambda_2$ must be updated.

To remove the portals of $v$ from the portals of $\lambda_1$, we need to
change the hash table indicating whether a separator is related to
$\lambda_1$ at most once, and to change the portal sequences of $\lambda_1$
for at most $O(\log n)$ separators.

To add the portals of $v$ to the portals of $\lambda_2$, we need to change the
hash table indicating whether a separator is related to $\lambda_2$ at most
once, and to change the portal sequences of $\lambda_2$ for at most
$O(\log n)$ separators.
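The update steps above can be sketched with plain dictionaries and sorted lists. This is only an illustration of which tables are touched: the data layout and all names (`label_portals`, `add_node`, `remove_node`, `change_label`) are assumptions, list operations here take time linear in the sequence length, and any range minimum structure built on a changed sequence would still have to be refreshed.

```python
import bisect
from collections import defaultdict

# label_portals[lam][(piece, path)] is a list of (h, d, node) triples kept
# sorted by h, the portal's distance from the root. The inner dict doubles as
# the hash table telling whether a separator has an entry for lam.
label_portals = defaultdict(dict)

def add_node(lam, node, node_portals):
    """node_portals: dict (piece, path) -> list of (h, d) portal pairs of node."""
    for key, pairs in node_portals.items():
        seq = label_portals[lam].setdefault(key, [])
        for h, d in pairs:
            bisect.insort(seq, (h, d, node))

def remove_node(lam, node, node_portals):
    for key in node_portals:
        seq = [t for t in label_portals[lam].get(key, []) if t[2] != node]
        if seq:
            label_portals[lam][key] = seq
        else:
            label_portals[lam].pop(key, None)  # separator no longer related to lam

def change_label(node, old, new, node_portals):
    """The portals of node itself are unchanged; only the two labels' tables move."""
    remove_node(old, node, node_portals)
    add_node(new, node, node_portals)

ports = {("p0", "P'"): [(3, 2)], ("p1", "P''"): [(1, 4)]}
add_node("cafe", "v", ports)
change_label("v", "cafe", "bank", ports)
print(label_portals["bank"][("p0", "P'")])  # [(3, 2, 'v')]
```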

6 Application 

Nearest Neighbor Search for Multiple Sets. Given a set $V(\lambda)$ of nodes,
it is trivial to construct a linear-size ($O(n)$) oracle that supports nearest
neighbor queries in $V(\lambda)$ for a query node $u$, i.e. returns the node
of $V(\lambda)$ closest to $u$. However, if there are several such sets
$\{V(\lambda_i)\}_{\lambda_i \in L}$, this trivial method needs
$O(|L| \cdot n)$ space, which may be as large as $O(n^2)$ even if each node is
associated with only one label. Using the oracle introduced in this paper, we
can construct an oracle with $O(\frac{1}{\epsilon} n \log n)$ space that
supports $(1 + \epsilon)$-NNS queries between a node and a label in
$O(\frac{1}{\epsilon} \log n \log \rho)$ time.

Now consider the case that each node in the graph may be associated with more
than one label. In this case, $K = \sum_{\lambda \in L} |V(\lambda)|$ can be
larger than $n$, and the oracle introduced here needs
$O(\frac{1}{\epsilon} K \log n)$ space.

Note that $O(|L| n) = \sum_{\lambda_i \in L} O(n)$ and
$O(\frac{1}{\epsilon} K \log n) = \sum_{\lambda_i \in L} O(\frac{1}{\epsilon} |V(\lambda_i)| \log n)$.
Hence the method introduced here is more space-efficient if
$\frac{1}{\epsilon} |V(\lambda_i)| = o(\frac{n}{\log n})$ for all
$\lambda_i \in L$.
Appendix A: Finding Portals to Promise Distance Property



Lemma 2. For each node $v$ and each path $P'$, there exists a set $\{z_i\}$ of
size less than $4(\epsilon - \epsilon^2)^{-1}$ for which the distance
condition is satisfied.

Proof. Let $z_0$ be the node on $P'$ that is closest to the node $v$. We
choose the remaining portals $z_i$ in two phases.

- Phase 1. In this phase, we choose a set of nodes $z_i$ that are closer than
  $z_0$ to the root $r$, using Algorithm 2. Define a node $z$ on $P'$ to be a
  candidate with respect to (w.r.t.) $i$ iff

  1. $z$ is closer to the root than $z_{i+1}$, and

  2. $\delta(v, z) < (1 + \epsilon)^{-1}(\delta(v, z_{i+1}) + \delta(r, z_{i+1}) - \delta(r, z))$.

  Initialization: $i \leftarrow -1$
  while there exist candidates w.r.t. $i$ do
      $z_i \leftarrow$ the candidate $z$ that is farthest from $r$
      $i \leftarrow i - 1$
  end

  Algorithm 2:

  Note the invariant for Phase 1: for $i < 0$ and any node $h$ lying strictly
  between $z_i$ and $z_{i+1}$ on $P'$, we have
  $\delta(v, z_{i+1}) + \delta(z_{i+1}, h) \le (1 + \epsilon)\,\delta(v, h)$.
  In particular, if there is no candidate w.r.t. $i$, then for any node $h$
  lying strictly between the root $r$ and $z_{i+1}$ on $P'$, we have
  $\delta(v, z_{i+1}) + \delta(z_{i+1}, h) \le (1 + \epsilon)\,\delta(v, h)$.

- Phase 2. In this phase, we choose a set of nodes $z_i$ that are farther than
  $z_0$ from the root $r$, using Algorithm 3. Define a node $z$ on $P'$ to be
  a candidate with respect to (w.r.t.) $i$ iff

  1. $z$ is farther from the root than $z_{i-1}$, and

  2. $\delta(v, z) < (1 + \epsilon)^{-1}(\delta(v, z_{i-1}) + \delta(r, z) - \delta(r, z_{i-1}))$.

  Initialization: $i \leftarrow 1$
  while there exist candidates w.r.t. $i$ do
      $z_i \leftarrow$ the candidate $z$ that is closest to $r$
      $i \leftarrow i + 1$
  end

  Algorithm 3:

  Note the invariant for Phase 2: for $i > 0$ and any node $h$ lying strictly
  between $z_i$ and $z_{i-1}$ on $P'$, we have
  $\delta(v, z_{i-1}) + \delta(z_{i-1}, h) \le (1 + \epsilon)\,\delta(v, h)$.
  In particular, if there is no candidate w.r.t. $i$, then for any node $h$
  lying beyond $z_{i-1}$ on $P'$, we have
  $\delta(v, z_{i-1}) + \delta(z_{i-1}, h) \le (1 + \epsilon)\,\delta(v, h)$.

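Clearly, the chosen $\{z_i\}$ satisfies the distance condition. It remains to
show that the number of $z_i$ chosen is $O(\frac{1}{\epsilon})$. We show the
analysis for $i > 0$; it applies to the case $i < 0$ in a similar way.
Since

$$\begin{aligned}
\delta(v, z_i) &< (1 + \epsilon)^{-1}\bigl(\delta(v, z_{i-1}) + \delta(r, z_i) - \delta(r, z_{i-1})\bigr) \\
&\le (1 + \epsilon)^{-1}\delta(v, z_{i-1}) + \delta(r, z_i) - \delta(r, z_{i-1}) \\
&\le \delta(v, z_{i-1}) - (\epsilon - \epsilon^2)\,\delta(v, z_{i-1}) + \delta(r, z_i) - \delta(r, z_{i-1}) \\
&\le \delta(v, z_{i-1}) - (\epsilon - \epsilon^2)\,\delta(v, z_0) + \delta(r, z_i) - \delta(r, z_{i-1}),
\end{aligned}$$

then

$$\begin{aligned}
\delta(v, z_i) - \delta(z_i, r) &\le \delta(v, z_{i-1}) - \delta(r, z_{i-1}) - (\epsilon - \epsilon^2)\,\delta(v, z_0) \\
&\le \delta(v, z_0) - \delta(r, z_0) - i(\epsilon - \epsilon^2)\,\delta(v, z_0).
\end{aligned}$$

Noting $\delta(v, z_i) - \delta(z_i, r) \ge -\delta(v, z_0) - \delta(r, z_0)$,
it follows that $i \le 2(\epsilon - \epsilon^2)^{-1}$.
This implies the lemma.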
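The two phases of the proof can be sketched as a single greedy scan over one path. This is a minimal illustration, assuming the path is given by two arrays (`dv` for distances from $v$, `h` for distances from the root, increasing along the path) and that distances between path nodes equal differences of `h`. The function names and the sample numbers are hypothetical.

```python
def choose_portals(dv, h, eps):
    """Greedy portal selection on one path.

    dv[j]: distance from v to the j-th node of the path
    h[j]:  distance from the root r to the j-th node, increasing in j
    Returns the sorted indices of the chosen portals.
    """
    n = len(dv)
    z0 = min(range(n), key=lambda j: dv[j])   # node of the path closest to v
    portals = [z0]

    def is_candidate(z, last):
        # Reaching z through the last portal costs more than (1 + eps) * dv[z].
        return (1 + eps) * dv[z] < dv[last] + abs(h[z] - h[last])

    last = z0
    for z in range(z0 - 1, -1, -1):           # Phase 1: toward the root
        if is_candidate(z, last):
            portals.append(z)
            last = z
    last = z0
    for z in range(z0 + 1, n):                # Phase 2: away from the root
        if is_candidate(z, last):
            portals.append(z)
            last = z
    return sorted(portals)

def has_distance_property(dv, h, eps, portals):
    """Every path node w is served within factor (1 + eps) by some portal."""
    return all(
        any(dv[z] + abs(h[w] - h[z]) <= (1 + eps) * dv[w] for z in portals)
        for w in range(len(dv))
    )

dv = [7, 4, 3, 5, 6, 9]
h = [0, 4, 8, 12, 16, 20]
portals = choose_portals(dv, h, 0.5)
print(portals, has_distance_property(dv, h, 0.5, portals))  # [1, 2, 4] True
```

Scanning outward from $z_0$ and taking the first candidate in each direction matches the rule of picking the candidate farthest from $r$ in Phase 1 and closest to $r$ in Phase 2.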



